
Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation



Abstract

We propose a novel deep learning model, which supports permutation invariant training (PIT), for speaker-independent multi-talker speech separation, commonly known as the cocktail-party problem. Different from most of the prior arts that treat speech separation as a multi-class regression problem, and from the deep clustering technique that considers it a segmentation (or clustering) problem, our model optimizes for the separation regression error, ignoring the order of mixing sources. This strategy cleverly solves the long-lasting label permutation problem that has prevented progress on deep learning based techniques for speech separation. Experiments on the equal-energy mixing setup of a Danish corpus confirm the effectiveness of PIT. We believe improvements built upon PIT can eventually solve the cocktail-party problem and enable real-world adoption of, e.g., automatic meeting transcription and multi-party human-computer interaction, where overlapping speech is common.
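
At its core, PIT evaluates the separation regression error under every possible assignment of network outputs to reference sources and trains on the minimum, so the loss is unaffected by the arbitrary order of the mixed speakers. A minimal PyTorch sketch of that idea, assuming an MSE criterion and tensors shaped (batch, speakers, time); the function name and shapes are illustrative, not taken from the paper:

    import itertools
    import torch

    def pit_mse_loss(estimates: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # estimates, targets: (batch, num_speakers, time).
        # Compute the MSE under every speaker permutation and keep the
        # per-utterance minimum, so training ignores the source order.
        num_spk = estimates.shape[1]
        per_perm = []
        for perm in itertools.permutations(range(num_spk)):
            reordered = targets[:, list(perm), :]
            per_perm.append(((estimates - reordered) ** 2).mean(dim=(1, 2)))
        # (num_perms, batch) -> best assignment per utterance, then batch mean
        return torch.stack(per_perm).min(dim=0).values.mean()

For S speakers this scans all S! output-to-source assignments, which stays cheap for small speaker counts such as two- or three-talker mixtures.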
